
    On the sphericity test with large-dimensional observations

    In this paper, we propose corrections to the likelihood ratio test and John's test for sphericity in large dimensions. New formulas for the limiting parameters in the CLT for linear spectral statistics of sample covariance matrices with general fourth moments are first established. Using these formulas, we derive the asymptotic distributions of the two proposed test statistics under the null. These asymptotics are valid for general populations, i.e. not necessarily Gaussian, provided the fourth moment is finite. Extensive Monte Carlo experiments are conducted to assess the quality of these tests in comparison with several existing methods from the literature. Moreover, we obtain their asymptotic power functions under a spiked population model as a specific alternative. Comment: 37 pages, 3 figures
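
For orientation, the uncorrected John's statistic that the abstract builds on can be sketched as follows. This is a minimal illustration, not the paper's corrected test: it assumes mean-zero data and the sample covariance S = X^T X / n, and it uses the classical form U = (1/p) tr[(S / ((1/p) tr S) - I)^2], which simplifies to p tr(S^2) / (tr S)^2 - 1. The function names `sample_covariance` and `john_statistic` are hypothetical, and the large-dimensional corrections and limiting distributions from the paper are not reproduced here.

```python
import random


def sample_covariance(rows):
    """Return S = X^T X / n for an n x p data matrix given as a list of rows.

    Assumption: the observations are already centered (mean zero).
    """
    n, p = len(rows), len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) / n for j in range(p)]
            for i in range(p)]


def john_statistic(S):
    """Classical John's statistic U = p * tr(S^2) / (tr S)^2 - 1.

    U is zero exactly when S is a multiple of the identity and is
    invariant to rescaling S by a positive constant.
    """
    p = len(S)
    tr_S = sum(S[i][i] for i in range(p))
    # tr(S^2) = sum over i, j of S[i][j] * S[j][i]
    tr_S2 = sum(S[i][j] * S[j][i] for i in range(p) for j in range(p))
    return p * tr_S2 / tr_S ** 2 - 1.0


# Simulated spherical data: n = 200 observations of dimension p = 20.
rng = random.Random(0)
n, p = 200, 20
X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
U = john_statistic(sample_covariance(X))
print(U)
```

Even under the null, U is strictly positive in finite samples, with a typical size of roughly p/n, which is why the statistic needs recentering and rescaling before it can be compared with a limiting distribution when p is comparable to n.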

    On singular values distribution of a large auto-covariance matrix in the ultra-dimensional regime

    Let $(\varepsilon_{t})_{t>0}$ be a sequence of independent real random vectors of dimension $p$ and let $X_T=\sum_{t=s+1}^{s+T}\varepsilon_t\varepsilon^T_{t-s}/T$ be the lag-$s$ ($s$ is a fixed positive integer) auto-covariance matrix of $\varepsilon_t$. This paper investigates the limiting behavior of the singular values of $X_T$ under the so-called ultra-dimensional regime where $p\to\infty$ and $T\to\infty$ in a related way such that $p/T\to 0$. First, we show that the singular value distribution of $X_T$ after a suitable normalization converges to a nonrandom limit $G$ (quarter law) under the fourth-moment condition. Second, we establish the convergence of its largest singular value to the right edge of $G$. Both results are derived using the moment method. Comment: 32 pages, 2 figures
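
The definition of the lag-s auto-covariance matrix above can be sketched directly in code. This is an illustrative construction only, assuming the vectors are stored in a 0-based list `eps` of length s + T so that `eps[k]` holds the vector with time index k + 1. The function name `lag_autocovariance` is hypothetical, and the normalization and singular value analysis from the paper are not reproduced.

```python
import random


def lag_autocovariance(eps, s):
    """Return X_T = (1/T) * sum_{t=s+1}^{s+T} eps_t eps_{t-s}^T as a p x p list.

    `eps` is a list of s + T vectors of length p; eps[k] is the vector
    at time k + 1, so the pair (eps_t, eps_{t-s}) is (eps[k + s], eps[k])
    for k = 0, ..., T - 1.
    """
    T = len(eps) - s
    p = len(eps[0])
    X = [[0.0] * p for _ in range(p)]
    for k in range(T):
        lead, lagged = eps[k + s], eps[k]
        for i in range(p):
            for j in range(p):
                X[i][j] += lead[i] * lagged[j] / T
    return X


# Simulated example: p = 3, T = 500, lag s = 1.
rng = random.Random(1)
p, T, s = 3, 500, 1
eps = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(s + T)]
X_T = lag_autocovariance(eps, s)
print(X_T)
```

Unlike a sample covariance matrix, X_T is generally not symmetric, which is why the abstract studies its singular values rather than its eigenvalues.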

    On Two Simple and Effective Procedures for High Dimensional Classification of General Populations

    In this paper, we generalize two criteria, the determinant-based and trace-based criteria proposed by Saranadasa (1993), to general populations for high dimensional classification. These two criteria compare some distances between a new observation and several different known groups. The determinant-based criterion performs well for correlated variables by integrating the covariance structure and is competitive with many other existing rules. The criterion, however, requires the measurement dimension to be smaller than the sample size. The trace-based criterion, in contrast, is an independence rule and is effective in the "large dimension-small sample size" scenario. An appealing property of these two criteria is that their implementation is straightforward and there is no need for preliminary variable selection or use of tuning parameters. Their asymptotic misclassification probabilities are derived using the theory of large dimensional random matrices. Their competitive performances are illustrated by intensive Monte Carlo experiments and a real data analysis. Comment: 5 figures; 22 pages. To appear in "Statistical Papers"

    Testing the Sphericity of a covariance matrix when the dimension is much larger than the sample size

    This paper focuses on the prominent sphericity test when the dimension $p$ is much larger than the sample size $n$. The classical likelihood ratio test (LRT) is no longer applicable when $p\gg n$. Therefore a Quasi-LRT is proposed, and the asymptotic distribution of the test statistic under the null when $p/n\rightarrow\infty$, $n\rightarrow\infty$ is established in this paper. Meanwhile, John's test has been found to possess the powerful dimension-proof property, which keeps exactly the same limiting distribution under the null with any $(n,p)$-asymptotic, i.e. $p/n\rightarrow[0,\infty]$, $n\rightarrow\infty$. All asymptotic results are derived for general populations with finite fourth-order moment. Numerical experiments are implemented for comparison.

    Gaussian fluctuations for linear spectral statistics of large random covariance matrices

    Consider an $N\times n$ matrix $\Sigma_n=\frac{1}{\sqrt{n}}R_n^{1/2}X_n$, where $R_n$ is a nonnegative definite Hermitian matrix and $X_n$ is a random matrix with i.i.d. real or complex standardized entries. The fluctuations of the linear statistics of the eigenvalues $\operatorname{Trace} f\bigl(\Sigma_n\Sigma_n^*\bigr)=\sum_{i=1}^N f(\lambda_i)$, where $(\lambda_i)$ are the eigenvalues of $\Sigma_n\Sigma_n^*$, are shown to be Gaussian, in the regime where both dimensions of matrix $\Sigma_n$ go to infinity at the same pace and in the case where $f$ is of class $C^3$, that is, has three continuous derivatives. The main improvements with respect to Bai and Silverstein's CLT [Ann. Probab. 32 (2004) 553-605] are twofold: First, we consider general entries with finite fourth moment, but whose fourth cumulant is nonnull, that is, whose fourth moment may differ from the moment of a (real or complex) Gaussian random variable. As a consequence, extra terms proportional to $\vert\mathcal{V}\vert^2=\bigl|\mathbb{E}\bigl(X_{11}^n\bigr)^2\bigr|^2$ and $\kappa=\mathbb{E}\bigl\vert X_{11}^n\bigr\vert^4-\vert\mathcal{V}\vert^2-2$ appear in the limiting variance and in the limiting bias, which not only depend on the spectrum of matrix $R_n$ but also on its eigenvectors. Second, we relax the analyticity assumption over $f$ by representing the linear statistics with the help of Helffer-Sjöstrand's formula. The CLT is expressed in terms of vanishing Lévy-Prohorov distance between the linear statistics' distribution and a Gaussian probability distribution, the mean and the variance of which depend upon $N$ and $n$ and may not converge. Comment: Published at http://dx.doi.org/10.1214/15-AAP1135 in the Annals of Applied Probability (http://www.imstat.org/aap/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Modeling extreme values of processes observed at irregular time steps: Application to significant wave height

    This work is motivated by the analysis of the extremal behavior of buoy and satellite data describing wave conditions in the North Atlantic Ocean. The available data sets consist of time series of significant wave height (Hs) with irregular time sampling. In such a situation, the usual statistical methods for analyzing extreme values cannot be used directly. The method proposed in this paper is an extension of the peaks over threshold (POT) method, where the distribution of a process above a high threshold is approximated by a max-stable process whose parameters are estimated by maximizing a composite likelihood function. The efficiency of the proposed method is assessed on an extensive set of simulated data. It is shown, in particular, that the method is able to describe the extremal behavior of several common time series models with regular or irregular time sampling. The method is then used to analyze Hs data in the North Atlantic Ocean. The results indicate that it is possible to derive realistic estimates of the extremal properties of Hs from satellite data, despite its complex space-time sampling. Comment: Published at http://dx.doi.org/10.1214/13-AOAS711 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)
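
As a rough illustration of the starting point of any peaks-over-threshold analysis, the sketch below extracts threshold exceedances from an irregularly sampled series. It covers only this first step: the max-stable approximation and the composite likelihood estimation described in the abstract are not reproduced. The function name `exceedances`, the sample values, and the threshold are all made up for the example.

```python
def exceedances(times, values, threshold):
    """Return (time, excess) pairs for observations strictly above `threshold`.

    The observation times are kept alongside each excess because, with
    irregular sampling, the gaps between exceedances carry information
    that a regularly sampled analysis would take for granted.
    """
    return [(t, v - threshold)
            for t, v in zip(times, values)
            if v > threshold]


# Hypothetical significant wave heights (m) at irregular times (days).
times = [0.0, 0.7, 3.1, 3.4, 9.0]
hs = [2.0, 5.0, 4.0, 7.0, 1.5]
peaks = exceedances(times, hs, threshold=4.5)
print(peaks)
```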